Discriminative Syntactic Reranking for Statistical Machine Translation

Author

  • Simon Carter
Abstract

This paper describes a method that successfully exploits simple syntactic features for n-best translation candidate reranking using perceptrons. Our approach uses discriminative language modelling to rerank the n-best translations generated by a statistical machine translation system. The performance is evaluated for Arabic-to-English translation using NIST’s MT-Eval benchmarks. Whilst parse trees do not consistently help, we show how features extracted from a simple Part-of-Speech annotation layer outperform two competitive baselines, leading to significant BLEU improvements on three different test sets.
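The general idea of perceptron-based n-best reranking can be sketched as follows. This is only a minimal illustration, not the paper's implementation: the Part-of-Speech bigram count features, the function names, and the plain (unaveraged) perceptron update are all assumptions made for the example, and the oracle candidate is assumed to be chosen beforehand, for instance by sentence-level BLEU.

```python
from collections import Counter


def pos_ngram_features(tags, n=2):
    """Count POS-tag n-grams in one candidate (hypothetical feature set)."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))


def score(weights, feats):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def rerank(weights, nbest):
    """Return the index of the highest-scoring candidate in an n-best list.

    Each candidate is represented here only by its POS tag sequence.
    Ties are broken in favour of the earlier candidate.
    """
    return max(range(len(nbest)),
               key=lambda i: score(weights, pos_ngram_features(nbest[i])))


def train_perceptron(data, epochs=5):
    """Train on (nbest, oracle_index) pairs with the standard perceptron rule.

    When the model prefers a non-oracle candidate, the oracle's features are
    rewarded and the wrongly preferred candidate's features are penalised.
    """
    weights = {}
    for _ in range(epochs):
        for nbest, oracle in data:
            pred = rerank(weights, nbest)
            if pred != oracle:
                for f, v in pos_ngram_features(nbest[oracle]).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in pos_ngram_features(nbest[pred]).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights


# Toy example: two candidates, the second one is the assumed oracle.
toy_nbest = [["DT", "VB", "NN"], ["DT", "NN", "VB"]]
toy_weights = train_perceptron([(toy_nbest, 1)])
best_index = rerank(toy_weights, toy_nbest)
```

In a full system the learned score would typically be combined with the baseline decoder's score; that combination is not shown here.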


Similar resources

Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation

Research on grammatical error correction has received considerable attention. For dealing with all types of errors, grammatical error correction methods that employ statistical machine translation (SMT) have been proposed in recent years. An SMT system generates candidates with scores for all candidates and selects the sentence with the highest score as the correction result. However, the 1-bes...


Discriminative Reranking for Machine Translation

This paper describes the application of discriminative reranking techniques to the problem of machine translation. For each sentence in the source language, we obtain from a baseline statistical machine translation system a ranked n-best list of candidate translations in the target language. We introduce two novel perceptron-inspired reranking algorithms that improve on the quality of machine tr...


Reranking Translation Hypotheses Using Structural Properties

We investigate methods that add syntactically motivated features to a statistical machine translation system in a reranking framework. The goal is to analyze whether shallow parsing techniques help in identifying ungrammatical hypotheses. We show that improvements are possible by utilizing supertagging, lightweight dependency analysis, a link grammar parser and a maximum-entropy based chunk par...


Modèle de traduction statistique à fragments enrichi par la syntaxe. (A Syntax-Augmented Phrase-Based Statistical Machine Translation Model)

Traditional Statistical Machine Translation models are not aware of linguistic structure. Thus, target lexical choices and word order are controlled only by surface-based statistics learned from the training corpus. Knowledge of linguistic structure can be beneficial since it provides generic information compensating data sparsity. The purpose of our work is to study the impact of syntactic inf...


A Discriminative Syntactic Word Order Model for Machine Translation

We present a global discriminative statistical word order model for machine translation. Our model combines syntactic movement and surface movement information, and is discriminatively trained to choose among possible word orders. We show that combining discriminative training with features to detect these two different kinds of movement phenomena leads to substantial improvements in word order...




Publication date: 2010